Co-Multistage of Multiple Classifiers for Imbalanced Multiclass Learning

Authors

  • Luís Marujo
  • Anatole Gershman
  • Jaime G. Carbonell
  • David Martins de Matos
  • João Paulo da Silva Neto
Abstract

In this work, we propose two stochastic architectural models (CMC and CMC-M) with two layers of classifiers, applicable to datasets with one and multiple skewed classes. This distinction becomes important when the datasets have a large number of classes. Therefore, we present a novel solution to imbalanced multiclass learning with several skewed majority classes, which improves the identification of minority classes. This is particularly important for text classification tasks, such as event detection. Our models, combined with preprocessing sampling techniques, improved the classification results on six well-known datasets. Finally, we also introduce a new metric, SG-Mean, to overcome the multiplication-by-zero limitation of G-Mean.
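The multiplication-by-zero limitation mentioned in the abstract can be illustrated with a minimal sketch, assuming the common definition of G-Mean as the geometric mean of per-class recalls. The function names and the toy labels are hypothetical, and SG-Mean is not implemented here because the abstract does not give its definition.

```python
import math


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that occurs in y_true."""
    recalls = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls[c] = hits / total
    return recalls


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition of G-Mean)."""
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls.values()) ** (1.0 / len(recalls))


# Toy data (hypothetical): two majority classes and one minority class "c".
y_true = ["a"] * 8 + ["b"] * 8 + ["c"] * 2

# Every "c" instance is misclassified, so recall("c") is 0 and the product
# collapses to 0 even though both majority classes are predicted perfectly.
y_pred_missed = ["a"] * 8 + ["b"] * 8 + ["a", "a"]

# One of the two "c" instances is recovered, so recall("c") is 0.5.
y_pred_partial = ["a"] * 8 + ["b"] * 8 + ["c", "a"]

print(g_mean(y_true, y_pred_missed))
print(g_mean(y_true, y_pred_partial))
```

With many classes, a single class with zero recall is enough to drive the score to zero, which hides any differences between classifiers on the remaining classes.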

Similar resources

Polichotomies on Imbalanced Domains by One-per-Class Compensated Reconstruction Rule

A key issue in machine learning is the ability to cope with recognition problems where one or more classes are under-represented with respect to the others. Indeed, traditional algorithms fail under class-imbalanced distributions, resulting in low predictive accuracy over the minority classes. While a large literature exists on binary imbalanced tasks, little research exists on multiclass learning. ...

Recognition of Multiple Imbalanced Cancer Types Based on DNA Microarray Data Using Ensemble Classifiers

DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce i...

Multistage SVM as a Clinical Decision Making Tool for Predicting Post Operative Patient Status

Because applying machine learning techniques in support of clinical decisions would improve decision making in healthcare, we present in this paper a comparative framework of Support Vector Machine (SVM) classifiers based on postoperative patient (POP) data. We compare the performance of a single multiclass SVM and a multistage SVM (MSVM) to those obtained by a number of other classifiers prese...

Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining

This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...

On dynamic ensemble selection and data preprocessing for multi-class imbalance learning

Class imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the majority class, which has a large number of instances. Ensembles of classifiers have been reported to yield promising results. However, the majority of ensemble metho...

Journal title:
  • CoRR

Volume abs/1312.6597  Issue 

Pages  -

Publication date 2013